We consider the problem of tracking an intruder using a network of wireless sensors. To track the intruder at each instant, the optimal number of sensors, in the right configuration, has to be powered. As powering the sensors consumes energy, there is a trade-off between accurately tracking the position of the intruder at each instant and the energy consumption of the sensors. This problem has been formulated in the framework of a Partially Observable Markov Decision Process (POMDP). Even for the simplest model considered in [1], the curse of dimensionality renders the problem intractable. We formulate this problem with a suitable state-action space in the POMDP framework and develop a reinforcement learning algorithm that uses the Upper Confidence Tree Search (UCT) method to mitigate the state-action space explosion. Through simulations, we illustrate that our algorithm scales well as the state and action spaces grow.
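
For readers unfamiliar with UCT, the following is a minimal sketch of the UCB1 selection rule that UCT-style tree search commonly applies at each node to balance exploring rarely tried actions against exploiting actions with good observed returns. It is an illustration only and not the implementation used in this work: the function name `ucb1_select`, its arguments, and the default exploration constant `c` are all assumptions. In the sensor setting, an action would correspond to a choice of which sensors to power, and the value would presumably combine tracking accuracy and energy cost.

```python
import math


def ucb1_select(counts, values, c=math.sqrt(2)):
    """Return the index of the action with the highest UCB1 score.

    counts[a]: number of times action a has been tried at this node
    values[a]: mean return observed so far for action a
    c:         exploration constant (hypothetical default, an assumption)
    """
    # Try every untried action once before applying the confidence bound.
    for a, n in enumerate(counts):
        if n == 0:
            return a

    total = sum(counts)
    # UCB1 score: mean return plus an exploration bonus that shrinks
    # as an action is tried more often.
    scores = [
        values[a] + c * math.sqrt(math.log(total) / counts[a])
        for a in range(len(counts))
    ]
    return max(range(len(scores)), key=scores.__getitem__)
```

With `c = 0` the rule reduces to greedy selection on the mean return; larger values of `c` favour actions that have been tried less often.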